22 research outputs found

    Recent Advances of Deep Learning in Bioinformatics and Computational Biology

    Extracting inherently valuable knowledge from omics big data remains a daunting problem in bioinformatics and computational biology. Deep learning, an emerging branch of machine learning, has exhibited unprecedented performance in quite a few applications in academia and industry. We highlight the differences and similarities among widely used deep learning models by discussing their basic structures and reviewing their diverse applications and disadvantages. We anticipate that this work can serve as a meaningful perspective for the further development of deep learning theory, algorithms, and applications in bioinformatics and computational biology.

    Cross-cell DNA methylation annotation and analysis for pan-cancer study

    Pan-cancer studies can uncover cell- and tissue-specific genomic loci and regions with underlying biological functions, and are one of the fundamental procedures toward precision medicine. We used an online curated DNA methylation annotation knowledgebase to perform a cross-cell interrogation in a pan-cancer study of breast cancer. The study revealed genome-wide differentially methylated loci and regions through reduced representation bisulfite sequencing profiling. The knowledgebase contains three levels of curated information across multiple cancer and normal cells from the ENCODE Consortium. The reference base covers all identified differentially methylated CpG sites and regions of interest, annotated with gene information, tumor suppressor genes, and methylation levels. Lastly, it includes the inferred functional association network and related Gene Ontology analysis results based on all the tumor suppressor genes identified from the differentially methylated regions of interest. Our knowledgebase and analysis results provide a thorough reference source for biomedical researchers and clinicians. The cross-cell analysis results are deposited at: http://github.com/gladex/DMAK.

    Microglia-Specific Promoter Activities of HEXB Gene

    Adeno-associated virus (AAV)-mediated genetic targeting of microglia remains a challenge. Overcoming this hurdle is essential for gene editing in the central nervous system (CNS). Here, we characterized the minimal/native promoter of the HEXB gene, which is known to be specifically and stably expressed in microglia under homeostatic and pathological conditions. Dual reporter and serial deletion assays identified the critical role of the natural 5’ untranslated region (−97 bp relative to the first ATG) in driving transcriptional activity of the mouse Hexb gene. The native promoter regions of mouse, human, and monkey HEXB are located at −135, −134, and −170 bp relative to the first ATG, respectively. These promoters were highly active and specific in microglia with strong cross-species transcriptional activities, but did not exhibit activity in primary astrocytes. In addition, we identified a 135 bp promoter of the CD68 gene that was highly active in microglia but not in astrocytes. Considering that HEXB is specifically expressed in microglia, these data suggest that the newly characterized microglia-specific HEXB minimal/native promoter can be an ideal candidate for microglia-targeting AAV gene therapy in the CNS.

    ChIP-seq Defined Genome-Wide Map of TGFβ/SMAD4 Targets: Implications with Clinical Outcome of Ovarian Cancer

    Deregulation of the transforming growth factor-β (TGFβ) signaling pathway in epithelial ovarian cancer has been reported, but the precise mechanism underlying disrupted TGFβ signaling in the disease remains unclear. We performed chromatin immunoprecipitation followed by sequencing (ChIP-seq) to screen genome-wide for TGFβ-induced SMAD4 binding in epithelial ovarian cancer. Following TGFβ stimulation of the A2780 epithelial ovarian cancer cell line, we identified 2,362 SMAD4 binding loci and 318 differentially expressed SMAD4 target genes. Comprehensive examination of SMAD4-bound loci revealed four distinct binding patterns: 1) Basal; 2) Shift; 3) Stimulated Only; 4) Unstimulated Only. TGFβ-stimulated SMAD4-bound loci were primarily classified as either Stimulated Only (74%) or Shift (25%), indicating that TGFβ stimulation alters SMAD4 binding patterns in epithelial ovarian cancer cells. Furthermore, based on gene regulatory network analysis, we determined that the TGFβ-induced, SMAD4-dependent regulatory network was strikingly different in ovarian cancer compared to normal cells. Importantly, the TGFβ/SMAD4 target genes identified in the A2780 epithelial ovarian cancer cell line were predictive of patient survival, based on in silico mining of publicly available patient databases. In conclusion, our data highlight the utility of next-generation sequencing technology to identify genome-wide SMAD4 target genes in epithelial ovarian cancer and link aberrant TGFβ/SMAD signaling to ovarian tumorigenesis. Furthermore, the identified SMAD4 binding loci, combined with gene expression profiling and in silico data mining of patient cohorts, may provide a powerful approach to determine potential gene signatures for biological and future translational research in ovarian and other cancers.

    META2: Intercellular DNA Methylation Pairwise Annotation and Integrative Analysis

    Genome-wide deciphering of intercellular differential DNA methylation, and of its roles in transcriptional regulation, remains elusive in cancer epigenetics. Here we developed META2, a toolkit for DNA methylation annotation and analysis, which aims to perform integrative analysis of differentially methylated loci and regions through deep mining and statistical comparison methods. META2 contains multiple versatile functions for investigating and annotating DNA methylation profiles. Benchmarking on T-47D cells, we interrogated the association among differentially methylated CpG (DMC) and region (DMR) candidate counts and region length, and identified major transition zones as clues for inferring statistically significant DMRs; we then validated those DMRs with functional annotation. Thus META2 can provide a comprehensive analysis approach for epigenetic research and clinical study.

    Advances in Genomic Profiling and Analysis of 3D Chromatin Structure and Interaction

    Recent sequence-based profiling technologies such as high-throughput chromosome conformation capture (Hi-C) and chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) have revolutionized the field of three-dimensional (3D) chromatin architecture. It is now recognized that the human genome functions as folded 3D chromatin units and that the looping paradigm is the basic principle of gene regulation. To better interpret the 3D data that have accumulated dramatically over the past five years and to gain deep biological insights, huge efforts have been made to develop novel quantitative analysis methods. However, a full understanding of genome regulation requires thorough knowledge of both genomic technologies and their related data analyses. We summarize recent advances in genomic technologies for identifying 3D chromatin structure and interaction, and illustrate the quantitative analysis methods used to infer functional domains and chromatin interactions. We further describe the emerging single-cell Hi-C technique and its computational analysis, and finally discuss future directions such as advances of 3D chromatin techniques in diseases.

    COPAR: A ChIP-Seq Optimal Peak Analyzer

    Sequencing data quality and peak alignment efficiency of ChIP-sequencing profiles are directly related to the reliability and reproducibility of NGS experiments. To date, no tool has been specifically designed for optimal peak alignment estimation and quality-related genomic feature extraction for ChIP-sequencing profiles. We developed COPAR, an open-source, user-friendly package, to statistically investigate, quantify, and visualize the optimal peak alignment and inherent genomic features using ChIP-seq data from NGS experiments. It provides a versatile perspective for biologists to perform quality checks on high-throughput experiments and optimize their experimental design. COPAR can process mapped ChIP-seq read files in BED format and output statistically sound results for multiple high-throughput experiments. We have deposited COPAR on GitHub under a GNU GPL license, together with three public ChIP-seq data sets used to verify the package.